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Abstract 

As drought is among the natural hazards which affects people and economies worldwide 
and often results in huge monetary losses sophisticated methods for drought monitoring 
and decision making are needed. Several different approaches to quantify drought have 
been developed during past decades. However, most of these drought indices suffer from 
different shortcomings and do not account for the multiple driving factors which promote 
drought conditions and their inter-dependencies. We provide a novel methodology for the 
calculation of (multivariate) drought indices, which combines the advantages of existing 
approaches and omits their disadvantages. Moreover, our approach benefits from the 
flexibility of vine copulas in modeling multivariate non-Gaussian inter-variable dependence 
structures. A three-variate data example is used in order to investigate drought conditions 
in Europe and to illustrate and reason the different modeling steps. The data analysis 
shows the appropriateness of the described methodology. Comparison to well-established 
drought indices shows the benefits of our multivariate approach. The validity of the new 
methodology is verified by comparing the spatial extent of historic drought events based 
on different drought indices. Further, we show that the assumption of non-Gaussian 
dependence structures is well-grounded in this real-world application. 

Keywords: standardized drought indices, dependence modeling, drought modeling, vine 
copulas 
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1 Introduction 


The challenging held of drought research has a long history. Scientists of different disci¬ 
plines described and dehned different drought concepts and tried to measure, quantify and 
predict drought events and their impacts. There exist several review papers trying to de¬ 
pict/portray the state of the art and different developments in drought modeling. One of 


the most recent and comprehensive ones is the review of drought concepts by Mishra and 


Singh (2010). They state that “drought is best characterized by multiple climatological 
and hydrological parameters”. Different drought types like meteorological drought (lack of 
precipitation), hydrological drought (declining water resources), agricultural drought (lack 
of soil moisture), socio-economic drought (excess demand for economic good(s) due to 
shortfall in water supply) or ground water drought (decrease in groundwater recharge, 
levels and discharge) are driven by different variables/phenomena. Recently, there have 


been several attempts to develop multivariate drought indicators (see e.g. |Kao and Govin- 

daraju 

2010 Hao and AghaKouchak, 2013, 

2014 

Farahmand and AghaKouchak 

2015), 


combining at least two different variables. Subsequently, we motivate and present a statis¬ 
tically sound approach for the calculation of standardized uni- and multivariate drought 
indices for arbitrary (sets of) drought relevant variables. The multivariate indices use so 
called vine copulas to flexibly model the variable dependencies. 


Copulas are explained best by Sklar’s Theorem (Sklar, 1959). Let F be a multivariate 
(d-dimensional) distribution function and Fi,... ,Fd the corresponding marginals. Then 
there exists a copula C, such that F{x) = C (Fi(xi),..., Fd^xa)), where x = (xi,..., Xd)' 
is the realization of a (continuous) random vector X G M'^. A copula itself is a d- 
dimensional distribution function on the unit hypercube with uniformly distributed mar¬ 
gins. It captures all dependency information between the marginals of the corresponding 
multivariate distribution function. Vine copulas are d-dimensional copula constructions 


built on bivariate copulas only (see Aas et ah, 2009; Difimann et ah, 2013). They allow 


very flexible modeling of non-Gaussian, asymmetric dependency structures due to their 
modularity. 

The most popular drought indices are the Palmer Drought Severity Index (PDSI) 
(Palmer, 1965) respectively its self-calibrating version (SC-PDSI) (Wells et al.| 2004) and 


the Standardized Precipitation Index (SPI) (McKee et ah, 1993 [Edwards and McKee 


1997). Drought indices in general should quantify deviations from normal conditions, 
i.e. they should take seasonality into account. Often negative/small values reflect dry 
conditions and positive/high values wet conditions. They usually reguire long data records 
to yield meaningful results. 

The PDSI is calculated based on precipitation and temperature and assumes a sim¬ 
plifying water balance model (for details see Palmer, 1965). The major criticisms on the 


PDSI are its lack of applicability and comparability for different climatic regions. Some 
of its major shortcomings vanished with the SC-PDSI, whose parameters are determined 
based on local climatic conditions rather than on some hxed locations in the US, i.e. 
it allows for spatial comparison. One further criticism of the PDSI is its autoregressive 
structure. Present conditions depend on past conditions, however the time interval which 
influences the present varies across space but cannot be accessed from the model. 


In contrast to the PDSI, other drought indices like the SPI (McKee et ah, 1993 Ed- 
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wards and McKee, 1997) are of probabilistic nature. This allows risk analysis, classification 
and frequency analysis of drought events. Two advantages of the purely precipitation 
based SPI over the PDSI are its standardization (standard normal distribution of SPI 
values) and the concept of time scales, which allows to set the time interval which has 
an influence on the present (drought) conditions. The SPI methodology can be applied 
to other variables as well (see e.g. the Standardized Runoff Index of Shukla and Wood 


2008) and the standardization allows for comparison of such standardized indices and 


across space and time. A criticism is that the SPI assumes a parametric distribution to 
model the data. However, a good £t to the data (especially in the distribution tails) 
is never guaranteed and in fact is not possible for many locations (e.g. in the Sahara). 
Moreover, temporal dependencies in the data or those introduced through the time scale 
cause the fitting to be biased. 

As an enhancement of the SPI the Standardized Precipitation Evapotranspiration 
Index (SPEI) (Vicente-Serrano et ah, 2010|| ) quantihes drought based on multivariate 
input. Instead of precipitation a climatic water balance (precipitation minus potential 
evapotranspiration) is considered to quantify dry/wet conditions. The SPEI allows for 
trends in the time series data such that these are passed on to the index (to include effects 
of climate change). 


Kao and Govindaraju (2010) present a (to our knowledge) hrst multivariate copula- 
based drought index, the Joint Dehcit Index (JDI). They apply it to precipitation and 
streamflow time-series, but application to other variables is possible. Marginals are mod¬ 
eled using the SPI approach. Empirical copulas are used to (non-parametrically) estimate 
the dependence structure of the marginals representing the different time scales of one 
to twelve months. Finally, the joint deficit index combines the drought information cap¬ 
tured by different time scales using the Kendall distribution function to assess the joint 
probability. The results are transformed to a standard normal distribution. Note, that 
for meaningful estimation of empirical copulas long data records are required. 


Farahmand and AghaKouchak (2015) introduce the Standardized Drought Analy¬ 


sis Toolbox (SDAT), with the aim to provide a generalized approach to derive non- 
parametric standardized drought indices. Based on precipitation and soil moisture time 
series, they present a multivariate approach to drought modeling. Enhancing the SPI idea 
to bivariate data (based on non-parametric estimation), a bivariate empirical distribution 
is htted to the input data and the joint cumulative probability is transformed with the 
inverse CDF of a standard normal distribution. Note however, that this approach doesn’t 
yield a real standardization. Usually negative values of the proposed index are more 
probable, since the joint cumulative probability is not uniformly distributed on [0,1]. 

Summarizing the lessons learned from the sophisticated drought indices revised above, 
we state that (univariate) drought indices should ... 


PROBAB be probabilistic (allow risk/frequency analysis and classihcation of drought events), 
i.e. no assumptions about the characteristics of the underlying system have to 
be made. 


ARBVAR be applicable to arbitrary drought relevant variables. 
DRYWET be negative/positive to indicate dry/wet conditions. 
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Table 1: Comparison of different drought indices (SC-PDSI, SPI/SPEI, JDI, SDAT) and 
their properties: + has this property, — doesn’t have this property, ? no dehnite answer 
possible or not applicable (e.g. because the corresponding model is not probabilistic). 



SC-PDSI 

SPI/SPEI 

JDI 

SDAT 

PROBAB 

— 

+ 

+ 

+ 

ARBVAR 

— 

? 

+ 

+ 

DRYWET 

+ 

+ 

+ 

+ 

SMALLS 

— 

— 

— 

— 

TRENDS 

+ 

+ 

+ 

+ 

SEASON 

+ 

+ 

+ 

— 

TIMDEP 

? 

— 

+ 

— 

NPDIST 

? 

— 

— 

+ 

STCOMP 

? 

+ 

+ 

— 

TSCALE 

— 

+ 

— 

+ 

MULTEX 

— 

? 

+ 

+ 


SMALLS yield meaningful results for (monthly) data records for 10 years and more (i.e. 
minimum sample size = 120). 

TRENDS reflect trends in the input data. 

SEASON model and eliminate seasonality. 

TIMDEP model and eliminate temporal dependencies before a probability distribution is 
htted. 

NPDIST use non-parametric distribution estimates for the (transformed) underlying vari¬ 
able (better £t, computationally efficient). 

STCOMP be standardized to enable comparison over space/time and with other indices. 

TSCALE allow for computation/aggregation at different time scales 1. 

MULTEX be extendable to multivariate input (different types of drought). 

Table summarizes different drought indices and lists which characteristics they fulhll. 
Subsequently, we introduce a novel approach to drought modeling which addresses the 
above criteria step by step. 


2 Data 

For the purpose of application and illustration we utilize the publicly available Climatic 
Research Unit (CRU) time series (TS) data (version 3.21, see Jones and Harris, 2013), 
which is monthly climatic data on a 0.5° x 0.5° (longitude x latitude) grid. We restrict 
this (model-calculated) data set to the area (11°W, 32°E) x (35°iV, 71°iV) covering most of 
Europe (see gray shaded area in Figure [^. This results in data for S = 3380 grid cells. For 
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the calculation of drought indices we use the variables potential evapotranspiration (PET), 
precipitation (PRE) and vapor pressure deficit (VPD) for the years 1961 to 2010 (T = 600 
months). VPD is calculated based on mean temperature (TMP) and vapor pressure (VAP) as 
VPD = SVP —VAP, where SVP = 6.1078- jg saturated vapor pressure 

(see Murray, 1967). The hve pixels C, N, E, SE and SW highlighted in Figureare used 
for subsequent illustrations. Their coordinates are provided in the hgure. Time series 
plots corresponding to these hve locations of the variables PET, PRE, VPD as well as SPl 
and SPEl are provided in the supporting information (see Section 7.1). 



c 

N 

E 

SE 

SW 


Figure 1: Study area and location of pixels used for illustration. 


3 Univariate standardized indices 

In a hrst step we seek to develop and illustrate a statistically sound (PROBAB) generalized 
modeling framework for (univariate) standardized indices. These indices should have the 
properties which were discussed in the introduction (Section [^. 

3.1 Variable transformation 

Let us now consider a time series Xt,,, k = 1,... ,T, for an arbitrary drought relevant 
variable (ARBVAR). Small values should always indicate dry and big values wet conditions 
(DRYWET). To ensure that, we change the sign of the time series beforehand if it was the 
other way round. Consider for instance potential evapotranspiration (PET). A high value 
corresponds to potentially high evaporation and transpiration, i.e. to dry conditions. For 
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low values the opposite is observed. Therefore we need to multiply the PET (and also the 
VPD) time series by —1. 

Subsequent steps include a month-wise standardization of the time series. Hence, it 
is preferable that the distribution of the time series for each month is not skewed. To 
achieve that, we consider monotone and continuous transformations. Figure shows the 
spatial variation of skewness for the month-wise time series of vapor pressure dehcit (VPD). 
We observe negative and positive skewness and a variation over the year, which supports 
a month-wise modeling approach. 
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Figure 2: Empirical skewness estimates for the month-wise vapor pressure dehcit (VPD) 
time series. 


Let each time point fc = 1,..., T, be a 2-tupel (m^, yk), where nik G {!,..., 12} 
(1 = January, ..., 12 = December) represents the month and the integer t/*, G Z the year 
corresponding to Then we consider the month-wise time series x^, ■= {xt,.)k£ic(m) = 
{x(rn,yk)y ^ ^ ni = 1,..., 12, wliere the index set for month m is dehned as /C(m) := 


{k : mk = m}. 

To eliminate/reduce skewness in the (12) month-wise time series Xm, m = 1,..., 12, 
we apply power transformations. An appropriate family of transformations, similar to the 


famous Box-Cox transformations, which is dehned not only for positive values is the |Yeo 
and Johnson (2000) transformation -0 : M x M —)■ M, dehned as 


" ((x -f 1)^ — l) /A if X > 0, A 7^ 0 

ln(x -F 1) if X > 0, A = 0 

-((-x + 1)2-^-1)/(2-A) ifx<0,A^2 

^ — ln(— X -f 1) if X < 0, A = 2. 
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Figure 3: Yeo and Johnson transformation parameter A for the month-wise VPD time 
series. 


Figure maps the Yeo and Johnson transformation parameter A for the month-wise 
VPD time series. The observed spatial paterns resemble those observed for skewness in 
Figure 

3.2 Elimination of seasonality 

Often (climatic) variables are subject to seasonal fluctuations (see e.g. PET, Figure]^. 
Moreover, they can be subject to trends (e.g. due to climate change). TRENDS are not 
removed since a drought index should be able to detect changes in drought frequency and 
intensity due to climate change. Since drought is considered as a (negative) deviation 
from ‘normal’ conditions (anomaly), we remove seasonality (SEASON). This is accounted 
for by month-wise modeling of the time series Xt,., k = 1,... ,T. However, to ensure that 
the sample size (SMALLS) for fitting a distribution is not too small, our deseasonalization 
procedure allows to recompose the resulting anomalies to a single time series. 

To eliminate seasonality, we model the month-wise mean separately for each of the 
12 time series x^, m = 1,... ,12. We estimate it as 

\r(mM ^{m,yk)y m = 1,. .. , 12. (1) 

kGKim) 

Figure 1^ illustrates the month-wise modeling ([^ of potential evapotranspiration (PET). 
Least-squares estimation ensures that X]fceyc(m) “ Fm) = 0 for all m = 1,..., 12. 

Thus also the anomalies := k = 1,...,T, are centered around 0 (i.e. 

'Ylik=i “ 0)- Hence, seasonal deviations from the annual mean could be eliminated. 
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Figure 4: Month-wise modeling of PET; The original time series (black, 1991-2010, for 
pixel C as given in Figure is superimposed by the corresponding month-wise mean £t 
(grey, cp. Equation 0 )- The month-wise time series are illustrated by points colored 
differently for each month. The modeled month-wise means are visualized by lines in the 
corresponding color. 


Also the variance of the time series may be subject to seasonality, i.e. in some months 
the time series may deviate more from its mean compared to other months. The color¬ 
coding in Figure [^reveals inhomogeneity of the variance. To quantify this seasonal hetero¬ 
geneity of the time series at^, k = 1,... ,T, we estimate month-wise standard deviations 
as 

V ' ^ k&K.{ra) 

where | ■ | is the cardinality. To obtain a homogenized time series we compute the stan¬ 
dardized anomalies (residuals) rt^ := fc = 1,... ,T. 

3.3 Elimination of temporal dependencies 

Apart from seasonality, time series often feature temporal dependence (TIMDEP). Such 
serial dependencies can be captured by autoregressive moving-average models (see e.g. 

)). For a (deseasonalized, homogeneous, zero-mean) time series rt^., k = 
1,... ,T, the autoregressive moving-average model ARMA(p, q) with AR-order p G No 


Box et ah (2008 
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and MA-order g G Nq is defined as 

p 1 

j=i i=i 

where the error terms St^. are i.i.d. A^(0,(T^) distribnted. Note, that for p or q eqnal to 
0 the corresponding snmmands are neglected. For adeqnate choice of the orders p and q 
and estimates (f)j, j = 1,... ,p, and 9j, j = 1,... ,q, of the corresponding parameters the 
model residnals := rt^ - = 1,..., T, are approximately 

temporally independent. For the variables at hand (PET, PRE, VPD) p = 1 and g = 0 are 
an adeqnate choice. 


3.4 Transformation to standard normal distribution 

As the assnmption of established standardized dronght indices like SPI and SPEI of a 
parametric distribntion model for the data performs bad, it seems appropriate to nse the 
(non-parametric) empirical distribntion (NPDIST) fnnction Ft{x) := ^ of 

the data respectively the residnals e^, k = 1,..., T, resnlting from the previons modeling 
step. Here is the indicator function, which equals 1 if the event A is true and 

0 otherwise. Note that for fitting a distribution (no matter if parametric or not) to a 
sample A; = 1,..., T, it is a critical assumption that the sample originates from the 
same distribution and is i.i.d. We ensured the i.i.d. assumption in the previous step by 
eliminating the temporal dependencies. 

We use the estimated distribution Ft to transform our residuals k = 1,... ,T, 
to the u-scale, i.e. to be (approximately) uniformly distributed on the interval [0,1]. 
This transformation is called probability integral transform (PIT). We calculate Ut^, '.= 
T/{T + l)Fr(etJ = rank(eiJ/(T + 1), /c = 1,..., T. We multiply by T/(T + 1) to 
avoid any Ut^ = 1. Further, we transform to the z-scale, calculating Zt^, := 
k = 1,... ,T, using the inverse PIT based on the CDF d) of a standard normal distribution. 
It holds that zt^, k = 1,... ,T, is (approximately) independent and identically standard 
normal distributed (STCOMP). 


3.5 Standardized indices on different time scales 


McKee et ah (1993) introduced the concept of time scales (TSCALE) to make their drought 


index (the SPI) applicable to different types of drought. We adopt this concept, however 
we perform the temporal aggregation in the end of the above described modeling process, 
in order not to violate the independency assumption for fitting a probability distribution 
to the residuals. This has also the advantage of being computationally more efficient. We 


need to perform the different modeling steps of Sections 3T 34 only once, after that we 
are able to calculate the index on arbitrary time scales. 

The (approximately) temporally independent standard normal distributed time series 


Zu, k = 1, 


, T, from above is already a standardized index with time scale / = 1. 


The normal distribution has the advantage that a sum of independent normal distributed 
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Table 2: Dryness and wetness categories. 



category 

cnmnlative 

probability 

qnantile 

WA 

exceptionally wet 

0.98-1.00 

+2.05 < SI < +CX) 

hF3 

extremely wet 

0.95-0.98 

+1.64 < SI < +2.05 

W2 

severely wet 

0.90-0.95 

+1.28 < SI < +1.64 

W1 

moderately wet 

0.80-0.90 

+0.84 < SI < +1.28 

WO 

abnormally wet 

0.70-0.80 

+0.52 < SI < +0.84 

DO 

abnormally dry 

0.20-0.30 

-0.84 < SI < -0.52 

D1 

moderately dry 

0.10-0.20 

-1.28 < SI < -0.84 

D2 

severely dry 

0.05-0.10 

-1.64 < SI < -1.28 

D3 

extremely dry 

0.02-0.05 

-2.05 < SI < -1.64 

DA 

exceptionally dry 

0.00-0.02 

-cx)< SI < -2.05 


random variables is again normally distribnted. We nse this property to calcnlate stan¬ 
dardized indices for time scales I > 1. The snm ^tk+i-j standard normal variables 
is normally distribnted with mean 0 and variance 1. Hence, we obtain a standardized index 
with time scale I as Sli(tk) ■= ^ Y!j=i k = 1,... ,T. 

To classify the valnes of standardized indices we nse the dryness/wetness categories 
dehned in Table based on qnantiles (cp. Svoboda et al-f 2002). A comparison of 


as 


precipitation (PRE) based dronght indices for different time scales is provided by Fignre 
For the selected location we identify persistent dry periods dnring the years 1976, 
1989 — 1991, 1992 — 1993 and 2003 — 2004. Whereas the index with time scale 1 identihes 
single (agricnltnral) dronght months, higher time scales (e.g. 6, 12) allow to identify 
persistent periods of dryness (hydrological dronght). 


4 Multivariate standardized indices 


Snbseqnently, we provide an extension of the methodology introdnced in Section to mnl- 
tivariate standardized dronght indices (MULTEX). This extension is based on vine copnlas 
(see Aas et ah, 2009) nsed for dependency modeling of the involved variables. The de¬ 


pendence parameters will be estimated nsing a semi-parametric estimation procednre (see 
1995[). Other copnla based dronght indices were introdnced by Farahmand 


Genest et al. 


and AghaKonchak (2015) and Kao and Govindarajn (2010) 


4.1 Marginal models 

As copnlas allow separate modeling of margins and dependence strnctnre, we hrst model 


the margins according to Sections 3.1 3.4 as in the nnivariate case. We transform the 


inpnt data (see Section 3.1), then we eliminate seasonality (see Section 3.2) and temporal 
dependencies (see Section 3.3) and estimate the distribntion of the remaining residnals 
non-parametrically (see Section 3.4). This enables transformation to the n-scale (copnla 


data) and after that copnla based dependency modeling. 
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Figure 5: Time series (1975 - 2004, for pixel C Figure of standardized drought index 
(SI) based on PRE, for time scales / = 1, 3,6 and 12, repectively. The color-coding reflects 
the severity of wetness/dryness according to the different categories specihed in Table 
1^ For better identihcation of dry/wet periods points at the top/bottom of the panels 
(colored accordingly) indicate points in time of wet/dry conditions. 


4.2 Vine copula based dependency modeling 

Let now u := {ui, ..., Ud) be the copula data obtained from the marginal models corre¬ 
sponding to d different drought relevant variables, where Uj = (uj,4)^=1,. ..,r, j = ■ ,d, 

and Uj^tk is the copula data corresponding to variable j at time tk. In a second (paramet¬ 
ric) step we select and estimate a vine copula C for this data. We illustrate this procedure 
based on a d = 3 dimensional example. For a more general explanation of vine copulas 
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see Aas et al. (2009) and DiBmann et al. (2013). 
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PET,PRE;VPD ^ - 
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Figure 6 : Selected vine tree structure. 


Let now d = 3 and 1 = VPD, 2 = PET and 3 = PRE. Generally, the structure of 
vine copulas is organized using a nested set of trees (graphs) fulhlling certain conditions. 
The edges of these trees correspond to bivariate copulas which are the building blocks 
of the vine copula. Selecting the tree structure as given in Figure we explicitly model 
the bivariate dependence structures (copulas) Ci^ 2 -i (tree Ti) and C 2 , 3 -i (tree T 2 ) for 
the variable pairs (VPD,PET), (VPD,PRE) and (PET,PRE) given VPD, respectively. Here C' 2 , 3 ;i 
denotes the pair-copula associated with the conditional distribution of the variable pair 
(2, 3) given variable 1. Further, we select pair-copula families for the pairs above and 
denote their parameters as 6 := ( 6 * 1 ^ 2 , 6 ^ 2 , sp). Then the vine copula density c is given 

as 


c{ui,U 2 i M3; 6 ) — Ci^2(mi, 112; (^1,2) ■ Cl,3(1115 Us; ^1,3) 

■ C2,3;l(h2|l(M2, Mi; ^ 1 , 2)5 ^3|1 (HS 5 Hi) ^ 1 , 2 ); 6^2,3 ;!)5 

where ci ,2 5 ci ,3 and C 2 , 3 ;i are the pair-copula densities corresponding to the copulas Gi ,2 5 
Gi _3 and (^ 2 , 3 ;!. The involved h-functions are dehned as hb\a{uiy,Ua]6) := C^a{ul,\ua]0), 
where C}y\a denotes the conditional distribution function of Ub given Ua- The tree structure 
can be saved in a triangular, so called R-vine matrix. For the given three dimensional 
example a valid R-vine matrix is given as 

/3 0 0\ /PRE 0 0 \ 

I 2 2 0 j respectively j PET PET 0 j . 

\1 1 1/ \VPD VPD VPD/ 


Whereas the second column encodes the pair (VPD,PET), the hrst column contains the 
pairs (VPD,PRE) and (PET,PRE;VPD). Other orders of these variables are possible. For a 
comparison of different orders see the supporting information (Section |7.1[ ). 

For the pair-copula family selection we can choose among a variety of bivariate copula 
families, amongst others among the Gaussian (N), Student-t (t), Glayton (G), Gumbel 
(G) Frank (F) and Joe (J) family, which all feature different dependence structures and 
properties. Also rotated versions of the Glayton, Gumbel and Joe copula are considered 
to capture negative asymmetric dependencies. The pair-copulas are selected separately 
(according to the BIG) starting in tree Ti. Their parameters are estimated at the same 
time using maximum likelihood estimation. Before that a bivariate independence test 
(Genest and Favre, 2007) can be performed, to see if an independence copula should be 
selected. For more details on different (rotated) copula families and their selection we 
refer to Brechmann and Schepsmeier (2013). 

Figure visualizes for all spatial pixels under consideration which dependence struc¬ 
tures were selected for the vine tree structure specihed above (Figure [^. For the pair 
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VPD,PRE PET,PRE;VPD 



Figure 7: Spatial variation in the pair-copula families selected for the pairs specihed in 
Figure 

(VPD,PET) the elliptical and symmetric Gaussian (N) and Student-t (t) copula were se¬ 
lected most over Europe. Where the Student-f copula was selected extreme high or low 
VPD and PET anomalies occur jointly, since the Student-f copula allows for dependence 
in the upper and lower distribution tails, so called tail-dependence. For a large area on 
the Iberian Peninsula (G) extreme wet conditions for both variable pairs seem to occur 
simultaneously, since the Gumbel copula allows for upper tail dependence. For most of 
Scandinavia (SG) we observe the opposite, high correlation of extreme dry conditions, 
since the survival/180° rotated Gumbel copula allows for lower tail dependence. For the 
other two (conditioned) pairs similar interpretations can be made. We observe that for 
most pixels non-Gaussian dependence structures were selected. 


4.3 Computation of multivariate indices 


Based on the previously selected vine copula C for the data u = (iti,..., we transform 
u to i.i.d. uniform data on [0,1], using the so called Rosenblatt (195^ transformation, a 
multivariate probability integral transform. The Rosenblatt transform v := {vi,... ,Vd) 
of u is dehned as 


G,ife 

^ 2,4 := C2\l{u2,t^:\Ul,tk), 


'^d,tk ■ ,...,, k . ,T 

where is the conditional cumulative distribution function for variable j given 

the variables 1,... ,j — 1, for all j = 2,..., d. For vine copulas the order of the variables 
is determined by the vine tree structure respectively the R-vine matrix. For details on 
the computation of the Rosenblatt transform for vine copulas see Schepsmeier (2015). 

Generally speaking, application of the Rosenblatt transform to our d dependent vari¬ 
ables yields independent information about dry/wet conditions captured in these vari- 
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ables. Vi incorporates the same information as an univariate drought index calculated 
according to Section based on variable 1. Vj, j = 2,... ,d, provide information on 
dry/wet conditions identihed by variable j, conditioned on the dryness/wetness informa¬ 
tion provided by the previously considered variables 1,... — 1. 

For our three dimensional example from above we compute fvpD,t = Wvpd.o which 
represents the dry-/wetness information captured in the variable VPD for time point t. 
VpET,t = C'pET|vpD(wpET,t|wvpD,t) provides additional information based on PET knowing about 
VPD in that particular time point t. Its calculation involves the pair-copula Cvpd.pet- The 
calculation of Upre.j = C'pRE|vpD,PET(wpRE,t|MvpD,t, ^pet,*) is a bit more involved. We calculate 
'VpRE,t = C'pRE|PET;vpD(C'pRE|vpD(wpRE,t|MvpD,t)|C'pET|vpD(wpET,t|wvpD,t)), based ou the palr-copulas 

CpRE,PET;VPD5 C'vPD.PRE ^ud C'vPD.PET- 

Subsequently we consider two different approaches to join this multivariate drought 
information into one index. For comparison, we provide a third approach which assumes 
multivariate normality. 


Method A (aggregation) This approach allows for a weighting with weights w = 
{wi ,..., Wd), Wj > 0, for the different variables j = 1,... ,d- We calculate the standardized 
multivariate index (SMI) with time scale I as 

l d 

K4+1-, 

2=1 j = l 


SMIj^(m;l,...,d)(4) := 


Vi 


w 


Method Ai (multiplication) For the second approach we exploit that the multivari¬ 
ate dependence structure of = (rip,... ,Vd) is represented by the independence copula 
Cn('yi, ... ,Vd) = ni=i '^j- Hence, we calculate Vt^ := ni=i k = 1,... ,T. To obtain 


a standardized (multivariate) index we proceed as in the univariate case (see Sections 3.4 
and 3.5). We calculate the rank transformation Ut,. := renakivt^) /{T + 1), k = 1,... ,T, 


transform to the z-scale and calculate the SMI with time scale I as 


SMI"(1; 1,..., <i)(4) ^ , 

v' i=i 


where no weighting is allowed, i.e. m = 1 := (1,..., 1). 


Method (normal) Let 2 be the marginal transformation of u to the z-scale and con¬ 
sider a vector of weights w = {wi, ..., Wd), Wj > 0. Assuming 2 to be a sample from a zero 
mean multivariate normal distribution, we can conclude that the linear transformation 
w'z is a sample from a zero mean univariate normal distribution. We estimate the sample 

variance of w'z by S := calculate a (weighted) SMI 

with time scale I as 

.. l d 

SMif («,; 1,,,, ,d)(tO - 7 ^ ■ 

V j=i 
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Table 3: Maximum spatial extent of drought events classified as extreme (-D3) or excep- 
tional (T>4) according to SPIg, SPEIg, SMI^(1; VPD, PET, PEE) and SMI^(1; VPD, PET, PRE). 




univariate 



multivariate 



SPI 

6 

SPEIg 

SMI 

A 

SMI 

M 

event 

max. 

% area 

max. 

% area 

max. 

% area 

max. 

% area 

1976 

07.1976 

31.0% 

08.1976 

28.4% 

08.1976 

28.7% 

08.1976 

24.1% 

1990 

03.1990 

18.3% 

03.1990 

21.3% 

05.1990 

25.7% 

05.1990 

36.1% 

2003 

08.2003 

21.9% 

08.2003 

37.2% 

08.2003 

50.7% 

08.2003 

46.9% 


5 Application 


To measure pair-wise dependence we use the rank-based association measure Kendall’s r 
(see e.g. Kendall, 1970). In Figure|^we provide maps of Kendall’s r between the univariate 
drought indices Sl6(VPD), Sl6(PET) and Sl6(PRE) and the (multivariate) drought indices 
SMI^(1;VPD,PET,PRE), SMI^(1; VPD, PET, PRE), SMI^(1; VPD, PET, PRE), SPIg and SPEIg 
on time scale 6, to see how the different variables contribute to the different drought 
indices and how this contribution varies over space. Whereas the SMI'^ is dominated 
by PET and VPD (high Kendall’s r values all over Europe for the pairs (SMIN,SIVPD) 
and (SMIN,SIPET)), the other indices are stronger associated with PRE (comparatively 
high Kendall’s r values for the pairs (SMIA,SIPRE), (SMIM,SIPRE), (SPI,SIPRE) and 
(SPEI,SIPRE)). For SMI"^ and SMI"^ the overall association with PET and VPD is stronger 
compared to SPI and SPEI (compare the corresponding pairs). Especially for SPI and 
SPEI we observe spatial differences in Kendall’s r (see all pairs involving SPI and SPEI). 

To validate and compare the different drought indices we consider the three major 
drought events of the 30 years period 1975 — 2004 which were observed in large parts of 
Europe. These droughts occured in the years 1976, 1989/90 and 2003. We summarize 
these events in Table It gives the dates when the drought events (in terms of an 
extreme {DA) or exceptional {DA) drought) reached their maximum spatial extent (i.e. 
the month in which the area affected by a 773 or DA drought reached it’s maximum) 
and the corresponding percentage of area under consideration which was affected by an 
extreme {DA) or exceptional {DA) drought. 


Figure compares time series of the percentage of area affected by drought accord¬ 
ing to the different univariate and multivariate drought indices calculated following the 
methodology described above, as well as SPIg and SPEIg. Comparing the univariate in¬ 
dices we see that those based on PET and VPD yield similar however not identical results. 
During the three major drought events in 1976, 1989/90 and 2003 all three univariate in¬ 
dices indicate extreme dry conditions for large parts of Europe. Comparison to the middle 
panel shows that the multivariate indices successfully combine the drought information 
captured in the single variables used for their calculation. For the years 1990 and 2003 
abnormally high PET and VPD aggravate the dry conditions due to a lack of precipitation. 
During the years 1994, 1995, 1999 and 2000 one can see that the vine copula based indices 
are more conservative compared to SMI'^, since they are not as much influenced by PET 
and VPD. In terms of spatial extent the multivariate indices classify the drought events of 
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Figure 8: Maps of Kendall’s r for all combinations of the univariate drought in¬ 
dices Sl6(VPD) (SIVPD), Sl6(PET) (SIPET) and Sl6(PRE) (SIPRE) with the indices 
SMI^(1;VPD,PET,PRE) (SMIN), SMI^(1; VPD, PET, PRE) (SMIA), SMI^(1; VPD, PET, PRE) 
(SMIM), SPIg (SPI) and SPEIg (SPEI). 


1990 and 2003 as more severe compared to SPI and SPEI. 


6 Conclusions and outlook 


Comparison of the advantages and disadvantages of existing drought indices and the 
flexibility of vine copulas in modeling multivariate dependence structures led to a novel 
and flexibly applicable approach to calculate drought indices based on arbitrary sets of 
drought relevant variables. This approach involves several well reasoned modeling steps 


which we summarize in Figure 10 


Taking several drought drivers and their dependencies into account at the same time 
our novel approach enables flexible modeling of different drought types and allows tailoring 
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Figure 9: Percentage of area affected by a D3 or DA drought according to Sl6(VPD), 
Sl6(PET) and Sl6(PRE) (upper panel), SMI^(1; VPD, PET, PRE), SMI^(1; VPD, PET, PRE) and 
SMI^(1; VPD, PET, PRE) (middle panel), and SPIg and SPEIg (lower panel). 


of drought indices to specihc applications. An example would be the application of the 
novel methodology in the held of ecology. Multivariate drought indices based on selected 
variables could be calibrated to tree ring data to hnd good models for the response of 
tree growth to climatic conditions. Moreover, the presented approach for the calculation 
of severity indices is not restricted to drought. Applications to model for example the 
degree of contamination of a water body due to different contaminants are feasible. 


7 Supporting information 

7.1 Accompanying figures and analyses 

We provide hgures and further analyses of the data at hand, visualizing/complementing 
the presented methodology for drought index calculation. We address the following issues: 
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Figure 10: Modeling steps for multivariate drought index calculation. 


1. Visualization of the data and its features 

2. Testing of multivariate normality 

3. Visualization of the area affected by drought according to the different indices 

4. Visualization of the inter-index association 

5. Visualization of drought index time series for selected locations 

6. Comparison of different variable orders for the calculation of multivariate drought 
indices 

7. Effect of trends on multivariate drought indices 


7.2 Software and data 

Moreover, we provide an R software package (SIndices, version 1.0) which is an implemen¬ 
tation of the presented methodology. It comes along with a detailed manual. Further, we 
provide the R-code which was used to produce all results presented in the article and the 
supporting information. The Climatic Research Unit (CRU) time series (TS) data (version 
3.21, see Jones and Harris, 2013) on which all examples and computations are based can be 
obtained from http://dx.doi.org/10.5285/D0E1585D-3417-485F-87AE-4FCECF10A992, 
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